fix: era get_block_slot_indexes() not working #1412

Merged
5 commits merged into ethereum:master on Sep 6, 2024

Conversation

@KolbyML (Member) commented on Sep 2, 2024

What was wrong?

  • get_block_slot_indexes() was broken: the indices are apparently zero-based offsets from the start of the slot_index, not from 0.

How was it fixed?

Reimplemented get_block_slot_indexes() to work with zero-based offsets from the start of the slot_index.
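
For context, the record this function parses is the era slot-index discussed in the review threads below. A rough sketch of its layout, with illustrative field names (not necessarily the crate's exact types):

```rust
// Sketch of a SlotIndex record as discussed in this thread (not the crate's exact definition).
//
// | e2store header (8 bytes) | starting_slot (8) | index 0 (8) | ... | index N-1 (8) | count (8) |
//
// Each index is an offset to a block entry. Per the spec quoted later in the review,
// offsets in a trailing index are negative and counted from the end of the data file,
// and an offset pointing at the very beginning of the file marks an empty (skipped) slot.
struct SlotIndexSketch {
    starting_slot: u64,
    indices: Vec<i64>, // one offset per slot in the era group
    count: u64,
}
```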

@KolbyML changed the title from "fix: era get_block_slot_indexes() not working + off by 1" to "fix: era get_block_slot_indexes() not working" on Sep 2, 2024
@KolbyML force-pushed the fix-era-support branch 3 times, most recently from 59b3bbd to 5e6f414, on September 2, 2024 at 08:49
@KolbyML KolbyML self-assigned this Sep 2, 2024
if slot_index_block_entry.slot_index.indices.is_empty() {
    return vec![];
}
// minus 8 because the first 8 bytes are the starting slot
Collaborator:

nit: it's minus 8 bytes for the first "entry", not "slot", right?
That should be VersionEntry, if I'm not wrong. Should we add something like SERIALIZED_SIZE to it (like we have for Header)?

@KolbyML (Member, Author) replied on Sep 2, 2024:

[image: screenshot]
starting_slot is 8 bytes.

This has nothing to do with a VersionEntry, to my understanding.

@KolbyML (Member, Author) added:

[image: screenshot]
Reading it again I am so confused, but I know it is 8 bytes.

    return vec![];
}
// minus 8 because the first 8 bytes are the starting slot
let beginning_of_index_record = slot_index_block_entry.slot_index.indices[0] - 8;
Collaborator:

This logic doesn't seem right to me. Maybe I'm missing something, so could you clarify:

  1. With the current logic, the first slot will always be occupied (because the check at line 124 will always be false). I don't think that's correct.
  2. Are we sure that indices[0] can't be lower than 8 (in which case we would panic)?

@KolbyML (Member, Author) replied:

> With the current logic, the first slot will always be occupied (because the check at line 124 will always be false). I don't think that's correct.

The first index isn't an offset of 0, it is an offset of 8 because of starting_slot.

> Are we sure that indices[0] can't be lower than 8 (in which case we would panic)?

Yes, because it will always be 8 bytes higher than starting_slot.

Collaborator:

I looked more into this, and after debugging the bytes and looking at multiple examples, I think I figured it out.

First, the indices are negative integers, and we should update the indices field in both SlotIndexBlock and SlotIndexState to [i64; X]. It even says so in the spec:

> A SlotIndex record may appear in a stand-alone file which by convention ends with .e2i - in this case, the offset is counted as if the index was appended to its corresponding data file - offsets are thus negative and counted from the end of the data file. In particular, if the index is simply appended to the data file, it does not change in contents.

Second, if an index (offset) points to the beginning of the file (meaning index == position_of_slot_index_block), then that block is skipped.
In most cases the first block is present and sits 8 bytes from the start of the file (VersionEntry takes the first 8 bytes), so your math works but for the wrong reason (it's not because of starting_slot but because of the VersionEntry).
However, if the first block is skipped (if I'm not mistaken, epoch 819 has starting slot 6701056, which is empty), this will not work, because the first block already points to the beginning of the file and we don't need to subtract 8 bytes.


Now, the solution. For this function to work, we need to know at what byte the SlotIndexBlock starts in the original file, and we have two ways of doing so:

  1. Sum the encoded sizes of all entries up to this point (version + blocks + state).
  2. Take the length of the file and subtract the encoded sizes of the SlotIndexBlock and SlotIndexState entries (which are fixed: 8192*8 + 3*8 = 65560 and 4*8 = 32).

I think it's a bit hard to explain this in words, but I tried my best. If something is not clear, we can have a call and discuss it.
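
A minimal sketch of option 2, using the fixed sizes from the list above (constant values follow the arithmetic there; the names are hypothetical):

```rust
/// Option 2 sketch: find where the block SlotIndex entry starts using only the file length.
/// 65_560 = header (8) + starting_slot (8) + 8192 indices (8192 * 8) + count (8)
/// 32     = header (8) + starting_slot (8) + 1 state index (8) + count (8)
const SLOT_INDEX_BLOCK_ENTRY_SIZE: u64 = 65_560;
const SLOT_INDEX_STATE_ENTRY_SIZE: u64 = 32;

fn beginning_of_block_index(file_length: u64) -> u64 {
    // The block and state slot-index entries are the last two entries in an era file,
    // so the block index starts at file_length minus their fixed sizes.
    file_length - SLOT_INDEX_BLOCK_ENTRY_SIZE - SLOT_INDEX_STATE_ENTRY_SIZE
}
```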

@KolbyML (Member, Author) replied:

I chose the 2nd option.

@KolbyML (Member, Author) commented on Sep 5, 2024:

@morph-dev ready for another review

    file_length: usize,
    slot_index_block_entry: &SlotIndexBlockEntry,
) -> Vec<u64> {
    // To calculate the beginning of the index record, we need to subtract:
Collaborator:

All of these should be constants in various places. For example:

  • SlotIndexBlock should have something like:

    SERIALIZED_SIZE: usize = 8 * (1 + SLOTS_PER_HISTORICAL_ROOT + 1);

    which will be used in impl TryFrom<&Entry> for SlotIndexBlockEntry

  • SlotIndexBlockEntry should also have

    SERIALIZED_SIZE: usize = Header::SERIALIZED_SIZE + SlotIndexBlock::SERIALIZED_SIZE;

Similar for SlotIndexStateEntry.
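
A hedged sketch of how those suggested constants could look, with stub types so it stands alone (the real definitions in the project's e2store/era code may differ):

```rust
// Stub types standing in for the real e2store types discussed above.
struct Header;
struct SlotIndexBlock;
struct SlotIndexBlockEntry;
struct SlotIndexState;
struct SlotIndexStateEntry;

const SLOTS_PER_HISTORICAL_ROOT: usize = 8192;

impl Header {
    // An e2store header is 8 bytes (type + length + reserved).
    const SERIALIZED_SIZE: usize = 8;
}

impl SlotIndexBlock {
    // starting_slot + one index per slot + count, 8 bytes each.
    const SERIALIZED_SIZE: usize = 8 * (1 + SLOTS_PER_HISTORICAL_ROOT + 1);
}

impl SlotIndexBlockEntry {
    // Header + payload = 8 + 65_552 = 65_560 bytes.
    const SERIALIZED_SIZE: usize = Header::SERIALIZED_SIZE + SlotIndexBlock::SERIALIZED_SIZE;
}

impl SlotIndexState {
    // starting_slot + a single state index + count.
    const SERIALIZED_SIZE: usize = 8 * (1 + 1 + 1);
}

impl SlotIndexStateEntry {
    // Header + payload = 8 + 24 = 32 bytes.
    const SERIALIZED_SIZE: usize = Header::SERIALIZED_SIZE + SlotIndexState::SERIALIZED_SIZE;
}
```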

    // from the total file length.
    // Then subtract the result from u64::MAX to get the starting slot of the index record,
    // then add 1.
    let beginning_of_index_record = u64::MAX - (file_length as u64 - (65560 + 32)) + 1;
Collaborator:

The indices/offsets should really be i64. Combined with the comment above, this would be:

let beginning_of_index_record = file_length - SlotIndexBlockEntry::SERIALIZED_SIZE - SlotIndexStateEntry::SERIALIZED_SIZE;
let beginning_of_file_offset = -(beginning_of_index_record as i64);

And later you would just check if *offset != beginning_of_file_offset {
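
For what it's worth, a small sketch of how that signed form relates to the merged u64 arithmetic above; `beginning_of_index_record` is assumed to already be the byte distance computed from the file length, and the equality holds by two's-complement wrapping:

```rust
fn main() {
    // Example distance from the start of the file to the start of the block slot-index entry
    // (in the real code this comes from file_length - 65_560 - 32).
    let beginning_of_index_record: u64 = 34_408;

    // Reviewer's signed form: an empty slot's offset points back to the beginning of the file.
    let beginning_of_file_offset: i64 = -(beginning_of_index_record as i64);

    // Merged unsigned form: u64::MAX - x + 1 == 2^64 - x, the two's-complement bit pattern of -x.
    let sentinel = u64::MAX - beginning_of_index_record + 1;

    // Both represent the same sentinel value, so "skip this slot" is the same check either way:
    // `*offset != beginning_of_file_offset` (i64) or `index != sentinel` (u64).
    assert_eq!(sentinel, beginning_of_file_offset as u64);
}
```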

@KolbyML (Member, Author) commented on Sep 5, 2024:

@morph-dev ready for another review

@@ -271,7 +288,9 @@ impl TryInto<Entry> for SlotIndexBlockEntry {
    fn try_into(self) -> Result<Entry, Self::Error> {
        let mut buf: Vec<u64> = vec![];
Collaborator:

nit: instead of collecting all of these into Vec<u64>, then converting each into [u8;8] and then each of those into Vec<u8>, and finally collecting all of them into one Vec<u8>, I think we can do something like this:

let mut buf = vec![];

buf.extend_from_slice(&self.slot_index.starting_slot.to_le_bytes());
for index in &self.slot_index.indices {
    buf.extend_from_slice(&index.to_le_bytes());
}
buf.extend_from_slice(&self.slot_index.count.to_le_bytes());

Ok(Entry::new(0x3269, buf))

Considering that there is likely no big gain in performance, and my suggestion might be a bit less readable, it's up to you.

Also below.

@KolbyML (Member, Author) replied on Sep 6, 2024:

I think this is better 👍

@KolbyML merged commit 5636cc6 into ethereum:master on Sep 6, 2024
8 checks passed